Persistence diagrams are common descriptors of the topological structure of data appearing in various classification and regression tasks. They can be generalized to Radon measures supported on the birth-death plane and endowed with an optimal transport distance. Examples of such measures are expectations of probability distributions on the space of persistence diagrams. In this paper, we develop methods for approximating continuous functions on the space of Radon measures supported on the birth-death plane, as well as their utilization in supervised learning tasks. Indeed, we show that any continuous function defined on a compact subset of the space of such measures (e.g., a classifier or regressor) can be approximated arbitrarily well by polynomial combinations of features computed using a continuous compactly supported function on the birth-death plane (a template). We provide insights into the structure of relatively compact subsets of the space of Radon measures, and test our approximation methodology on various data sets and supervised learning tasks.
translated by 谷歌翻译
The circular coordinates algorithm of de Silva, Morozov, and Vejdemo-Johansson takes as input a dataset together with a cohomology class representing a $1$-dimensional hole in the data; the output is a map from the data into the circle that captures this hole, and that is of minimum energy in a suitable sense. However, when applied to several cohomology classes, the output circle-valued maps can be "geometrically correlated" even if the chosen cohomology classes are linearly independent. It is shown in the original work that less correlated maps can be obtained with suitable integer linear combinations of the cohomology classes, with the linear combinations being chosen by inspection. In this paper, we identify a formal notion of geometric correlation between circle-valued maps which, in the Riemannian manifold case, corresponds to the Dirichlet form, a bilinear form derived from the Dirichlet energy. We describe a systematic procedure for constructing low energy torus-valued maps on data, starting from a set of linearly independent cohomology classes. We showcase our procedure with computational examples. Our main algorithm is based on the Lenstra--Lenstra--Lov\'asz algorithm from computational number theory.
translated by 谷歌翻译
Testing the significance of a variable or group of variables $X$ for predicting a response $Y$, given additional covariates $Z$, is a ubiquitous task in statistics. A simple but common approach is to specify a linear model, and then test whether the regression coefficient for $X$ is non-zero. However, when the model is misspecified, the test may have poor power, for example when $X$ is involved in complex interactions, or lead to many false rejections. In this work we study the problem of testing the model-free null of conditional mean independence, i.e. that the conditional mean of $Y$ given $X$ and $Z$ does not depend on $X$. We propose a simple and general framework that can leverage flexible nonparametric or machine learning methods, such as additive models or random forests, to yield both robust error control and high power. The procedure involves using these methods to perform regressions, first to estimate a form of projection of $Y$ on $X$ and $Z$ using one half of the data, and then to estimate the expected conditional covariance between this projection and $Y$ on the remaining half of the data. While the approach is general, we show that a version of our procedure using spline regression achieves what we show is the minimax optimal rate in this nonparametric testing problem. Numerical experiments demonstrate the effectiveness of our approach both in terms of maintaining Type I error control, and power, compared to several existing approaches.
translated by 谷歌翻译
超越地球轨道的人类空间勘探将涉及大量距离和持续时间的任务。为了有效减轻无数空间健康危害,数据和空间健康系统的范式转移是实现地球独立性的,而不是Earth-Reliance所必需的。有希望在生物学和健康的人工智能和机器学习领域的发展可以解决这些需求。我们提出了一个适当的自主和智能精密空间健康系统,可以监控,汇总和评估生物医学状态;分析和预测个性化不良健康结果;适应并响应新累积的数据;并提供对其船员医务人员的个人深度空间机组人员和迭代决策支持的预防性,可操作和及时的见解。在这里,我们介绍了美国国家航空航天局组织的研讨会的建议摘要,以便在太空生物学和健康中未来的人工智能应用。在未来十年,生物监测技术,生物标志科学,航天器硬件,智能软件和简化的数据管理必须成熟,并编织成精确的空间健康系统,以使人类在深空中茁壮成长。
translated by 谷歌翻译
空间生物学研究旨在了解太空飞行对生物的根本影响,制定支持深度空间探索的基础知识,最终生物工程航天器和栖息地稳定植物,农作物,微生物,动物和人类的生态系统,为持续的多行星寿命稳定。要提高这些目标,该领域利用了来自星空和地下模拟研究的实验,平台,数据和模型生物。由于研究扩展到低地球轨道之外,实验和平台必须是最大自主,光,敏捷和智能化,以加快知识发现。在这里,我们介绍了由美国国家航空航天局的人工智能,机器学习和建模应用程序组织的研讨会的建议摘要,这些应用程序为这些空间生物学挑战提供了关键解决方案。在未来十年中,将人工智能融入太空生物学领域将深化天空效应的生物学理解,促进预测性建模和分析,支持最大自主和可重复的实验,并有效地管理星载数据和元数据,所有目标使生活能够在深空中茁壮成长。
translated by 谷歌翻译